FedKLPR: Personalized Federated Learning for Person Re-Identification with Adaptive Pruning

Yu, Po-Hsien, Tseng, Yu-Syuan, Chien, Shao-Yi

arXiv.org Artificial Intelligence

KL-Divergence Regularization Loss (KLL): We introduce a regularization loss function based on KL-divergence that explicitly measures and minimizes the probabilistic divergence between local and personalized model distributions. This theoretically grounded approach effectively prevents model drift while preserving the statistical characteristics of distributed client data, overcoming the limitations of conventional cosine-similarity metrics.

KL-Divergence-Prune Weighted Aggregation (KLPWA): We introduce a novel aggregation strategy that integrates KL-divergence-based distributional similarity, the KL-Divergence-aggregation Weight (KLAW), and client-specific pruning ratios, the Pruning-ratio-aggregation Weight (PRAW), into a unified weighting mechanism. This approach dynamically prioritizes clients that exhibit stronger alignment with the global model while contributing compact, efficiently pruned models. By jointly considering statistical consistency and model sparsity, KLPWA surpasses traditional aggregation methods in handling non-IID data distributions and substantially reduces communication costs.

Sparse Activation Skipping (SAS): We present a mechanism that skips pruned parameters during aggregation so that the global model is updated only with essential information.

Cross-Round Recovery (CRR): To mitigate severe accuracy degradation caused by model pruning, we introduce CRR, a two-stage pruning strategy. CRR enables more precise decisions on whether to perform pruning, thus maintaining model accuracy after pruning.

The remainder of this paper is organized as follows: Section II introduces related works and the background of unsupervised federated person ReID.
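The KLL term described above can be illustrated with a minimal sketch: a KL penalty that pulls the local model's output distribution toward the personalized model's. The function names, the softmax-over-logits formulation, and the weight `lam` are illustrative assumptions, not the paper's implementation.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions; eps guards log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def kll_regularized_loss(task_loss, local_logits, personalized_logits, lam=0.1):
    """Task loss plus a KL penalty keeping the local model's output
    distribution close to the personalized model's (hypothetical form)."""
    p = softmax(local_logits)
    q = softmax(personalized_logits)
    return task_loss + lam * kl_divergence(p, q)
```

When the two models agree exactly, the penalty vanishes and only the task loss remains; the more their output distributions diverge, the larger the added term.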


Dual-Granularity Cross-Modal Identity Association for Weakly-Supervised Text-to-Person Image Matching

Zhang, Yafei, Shang, Yongle, Li, Huafeng

arXiv.org Artificial Intelligence

Weakly supervised text-to-person image matching, as a crucial approach to reducing models' reliance on large-scale manually labeled samples, holds significant research value. However, existing methods struggle to predict complex one-to-many identity relationships, severely limiting performance improvements. To address this challenge, we propose a local-and-global dual-granularity identity association mechanism. Specifically, at the local level, we explicitly establish cross-modal identity relationships within a batch, reinforcing identity constraints across different modalities and enabling the model to better capture subtle differences and correlations. At the global level, we construct a dynamic cross-modal identity association network with the visual modality as the anchor and introduce a confidence-based dynamic adjustment mechanism, effectively enhancing the model's ability to identify weakly associated samples while improving overall sensitivity. Additionally, we propose an information-asymmetric sample pair construction method combined with consistency learning to tackle hard sample mining and enhance model robustness. Experimental results demonstrate that the proposed method substantially boosts cross-modal matching accuracy, providing an efficient and practical solution for text-to-person image matching.


Enhancing Long-Term Re-Identification Robustness Using Synthetic Data: A Comparative Analysis

Pionzewski, Christian, Rademacher, Rebecca, Rutinowski, Jérôme, Ponikarov, Antonia, Matzke, Stephan, Chilla, Tim, Schreynemackers, Pia, Kirchheim, Alice

arXiv.org Artificial Intelligence

This contribution explores the impact of synthetic training data usage and the prediction of material wear and aging in the context of re-identification. Different experimental setups and gallery-set expansion strategies are tested, analyzing their impact on performance over time for aging re-identification subjects. Using a continuously updating gallery, we were able to increase our mean Rank-1 accuracy by 24% as material aging was taken into account step by step. In addition, using models trained with 10% artificial training data, Rank-1 accuracy could be increased by up to 13% compared to a model trained only on real-world data, significantly boosting generalized performance on hold-out data. Finally, this work introduces a novel, open-source re-identification dataset, pallet-block-2696. This dataset contains 2,696 images of Euro pallets taken over a period of 4 months. During this time, natural aging processes occurred and some of the pallets were damaged during use. These wear-and-tear processes significantly changed the appearance of the pallets, providing a dataset that can be used to generate synthetically aged pallets or other wooden materials.


On the Effectiveness of Heterogeneous Ensemble Methods for Re-identification

Klüttermann, Simon, Rutinowski, Jérôme, Nguyen, Anh, Grimme, Britta, Roidl, Moritz, Müller, Emmanuel

arXiv.org Artificial Intelligence

In this contribution, we introduce a novel ensemble method for the re-identification of industrial entities, using images of chipwood pallets and galvanized metal plates as dataset examples. Our algorithms replace the commonly used, complex Siamese neural networks with an ensemble of simplified, rudimentary models, providing wider applicability, especially in hardware-restricted scenarios. Each ensemble sub-model uses a different type of feature extracted from the given data as its input, allowing for the creation of effective ensembles in a fraction of the training duration needed for more complex state-of-the-art models. We reach state-of-the-art performance on our task, with a Rank-1 accuracy of over 77% and a Rank-10 accuracy of over 99%. We introduce five distinct feature extraction approaches and study their combination using different ensemble methods.
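The ensemble idea above, one rudimentary sub-model per feature type combined by a simple rule, can be sketched as follows. The distance-averaging fusion and all names here are illustrative assumptions; the paper studies five feature extractors and several combination methods.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def ensemble_rank(query_feats, gallery_feats):
    """Rank gallery entries for a query by averaging per-feature-type distances.

    query_feats:   dict feature_type -> vector for the query image
    gallery_feats: dict gallery_id -> (dict feature_type -> vector)
    Returns gallery ids sorted best-match-first (lowest mean distance).
    """
    scores = {}
    for gid, feats in gallery_feats.items():
        dists = [euclidean(query_feats[t], feats[t]) for t in query_feats]
        scores[gid] = sum(dists) / len(dists)
    return sorted(scores, key=scores.get)
```

Rank-1 accuracy then asks whether the first id in the returned list is the true identity; Rank-10 asks whether it appears among the first ten.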


Quantum learning and essential cognition under the traction of meta-characteristics in an open world

Wang, Jin, Song, Changlin

arXiv.org Artificial Intelligence

Artificial intelligence has made significant progress on the Closed World problem, accurately recognizing previously learned knowledge through training and classification. However, AI faces significant challenges in the Open World problem, which involves a new and unknown exploration journey. AI is not inherently proactive in exploration, and its challenge lies in not knowing how to approach and adapt to the unknown world. How do humans acquire knowledge of the unknown world? Humans identify new knowledge through intrinsic cognition. In the process of recognizing new colors, the cognitive cues differ from known color features and involve hue, saturation, brightness, and other characteristics. When AI encounters objects with different features in the new world, it faces another challenge: where are the distinguishing boundaries between the influential features of new and old objects? AI often mistakes a new world's brown bear for a known dog because it has not learned the differences in feature distributions between knowledge systems; things in the new and old worlds have different units and dimensions for their features. This paper proposes an open-world model and elemental feature system that focuses on fundamentally recognizing the distribution differences in objective features between the new and old worlds. A quantum-tunneling-like transfer of learning ability between the new and old worlds is realized through the tractive force of meta-characteristics. The model system's outstanding performance in learning new knowledge (using pedestrian re-identification datasets as an example) demonstrates that AI has acquired the ability to recognize the new world with an accuracy of up to $96.71\%$ and has gained the capability to explore new knowledge, similar to humans.


Walking fingerprinting

Koffman, Lily, Crainiceanu, Ciprian, Leroux, Andrew

arXiv.org Machine Learning

We consider the problem of predicting an individual's identity from accelerometry data collected during walking. In a previous paper we introduced an approach that transforms the accelerometry time series into an image by constructing its complete empirical autocorrelation distribution. Predictors derived by partitioning this image into grid cells were used in logistic regression to predict individuals. Here we: (1) implement machine learning methods for prediction using the grid cell-derived predictors; (2) derive inferential methods to screen for the most predictive grid cells; and (3) develop a novel multivariate functional regression model that avoids partitioning of the predictor space into cells. Prediction methods are compared on two open source data sets: (1) accelerometry data collected from $32$ individuals walking on a $1.06$ kilometer path; and (2) accelerometry data collected from six repetitions of walking on a $20$ meter path on two separate occasions at least one week apart for $153$ study participants. In the $32$-individual study, all methods achieve at least $95$% rank-1 accuracy, while in the $153$-individual study, accuracy varies from $41$% to $98$%, depending on the method and prediction task. Methods provide insights into why some individuals are easier to predict than others.
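The autocorrelation-image construction described above can be sketched as follows: for each lag, pairs (x[t], x[t+lag]) of the accelerometry series are binned on a grid, and the cell counts become the predictors fed to downstream classifiers. The bin range, grid size, and function names are illustrative assumptions, not the authors' exact construction.

```python
def autocorrelation_image(series, max_lag=3, nbins=4, lo=-1.0, hi=1.0):
    """Coarse empirical autocorrelation 'image' of a 1-D time series.

    For each lag u in 1..max_lag, build an nbins x nbins histogram of the
    pairs (series[t], series[t + u]), clipping values to [lo, hi).
    Returns a dict: lag -> 2-D grid of counts.
    """
    width = (hi - lo) / nbins

    def cell(v):
        # Map a value to its bin index, clamping out-of-range values.
        return min(nbins - 1, max(0, int((v - lo) / width)))

    image = {}
    for u in range(1, max_lag + 1):
        grid = [[0] * nbins for _ in range(nbins)]
        for t in range(len(series) - u):
            grid[cell(series[t])][cell(series[t + u])] += 1
        image[u] = grid
    return image
```

Flattening the grids into a single vector yields the grid-cell-derived predictors that the logistic regression and machine learning methods in the abstract operate on.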


Spatial-temporal Vehicle Re-identification

Kim, Hye-Geun, Na, YouKyoung, Joe, Hae-Won, Moon, Yong-Hyuk, Cho, Yeong-Jun

arXiv.org Artificial Intelligence

Vehicle re-identification (ReID) in a large-scale camera network is important for public safety, traffic control, and security. However, due to the appearance ambiguities of vehicles, previous appearance-based ReID methods often fail to track vehicles across multiple cameras. To overcome this challenge, we propose a spatial-temporal vehicle ReID framework that estimates a reliable camera network topology based on the adaptive Parzen window method and optimally combines appearance and spatial-temporal similarities through a fusion network. With the proposed methods, we achieve superior performance on the public dataset (VeRi776) with 99.64% rank-1 accuracy. The experimental results support that utilizing spatial and temporal information for ReID can improve upon the accuracy of appearance-based methods and effectively deal with appearance ambiguities.
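The spatial-temporal side of the framework can be illustrated with a small sketch: a Gaussian-kernel Parzen estimate of the camera-to-camera travel-time density, fused with an appearance similarity score. The paper learns an adaptive window and a fusion network; the fixed bandwidth and linear weight `alpha` here are simplifying assumptions.

```python
import math

def parzen_density(samples, t, bandwidth=5.0):
    """Gaussian-kernel Parzen estimate of a transition-time density,
    evaluated at travel time t, from observed travel times `samples`."""
    def kernel(u):
        return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    return sum(kernel((t - s) / bandwidth) for s in samples) / (len(samples) * bandwidth)

def fused_score(appearance_sim, travel_time, transition_samples, alpha=0.7):
    """Linear fusion of appearance similarity and spatial-temporal likelihood.
    (The paper learns this fusion with a network; a fixed weight is used here.)"""
    st_likelihood = parzen_density(transition_samples, travel_time)
    return alpha * appearance_sim + (1 - alpha) * st_likelihood
```

A candidate match whose travel time between cameras is implausible under the estimated density is down-weighted even when its appearance is similar, which is how the framework resolves appearance ambiguities.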


Large-scale Training Data Search for Object Re-identification

Yao, Yue, Lei, Huan, Gedeon, Tom, Zheng, Liang

arXiv.org Artificial Intelligence

We consider a scenario where we have access to the target domain, but cannot afford on-the-fly training data annotation, and instead would like to construct an alternative training set from a large-scale data pool such that a competitive model can be obtained. We propose a search and pruning (SnP) solution to this training data search problem, tailored to object re-identification (re-ID), an application aiming to match the same object captured by different cameras. Specifically, the search stage identifies and merges clusters of source identities which exhibit similar distributions with the target domain. The second stage, subject to a budget, then selects identities and their images from the Stage I output, to control the size of the resulting training set for efficient training. The two steps provide us with training sets 80\% smaller than the source pool while achieving a similar or even higher re-ID accuracy. These training sets are also shown to be superior to a few existing search methods such as random sampling and greedy sampling under the same budget on training data size. If we release the budget, training sets resulting from the first stage alone allow even higher re-ID accuracy. We provide interesting discussions on the specificity of our method to the re-ID problem and particularly its role in bridging the re-ID domain gap. The code is available at https://github.com/yorkeyao/SnP.
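The two-stage SnP idea can be sketched at the identity level: rank source identities by how close their feature statistics are to the target domain, then keep a budgeted subset. Representing each identity by a mean feature vector and using Euclidean distance are simplifying assumptions; the paper matches cluster-level distributions against the target domain.

```python
def search_and_prune(source_ids, target_mean, budget):
    """Toy stand-in for the SnP pipeline.

    source_ids:  dict identity -> mean feature vector of that identity's images
    target_mean: mean feature vector of the (unlabeled) target domain
    budget:      number of identities to keep in the training set

    Stage 1 (search): rank identities by distance to the target-domain mean.
    Stage 2 (prune):  keep only the `budget` closest identities.
    """
    def dist(vec):
        return sum((a - b) ** 2 for a, b in zip(vec, target_mean)) ** 0.5

    ranked = sorted(source_ids, key=lambda ident: dist(source_ids[ident]))
    return ranked[:budget]
```

Raising `budget` recovers the behavior described in the abstract for a released budget: the pruning stage keeps more of the search-stage output, trading training cost for accuracy.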


Heterogeneous Visible-Thermal and Visible-Infrared Face Recognition using Unit-Class Loss and Cross-Modality Discriminator

Cheema, Usman, Ahmad, Mobeen, Han, Dongil, Moon, Seungbin

arXiv.org Artificial Intelligence

Abstract--Visible-to-thermal face image matching is a challenging variant of cross-modality recognition. The challenge lies in the large modality gap and low correlation between the visible and thermal modalities. Existing approaches employ image preprocessing, feature extraction, or common subspace projection, which are independent problems in themselves. In this paper, we propose an end-to-end framework for cross-modal face recognition. The proposed algorithm aims to learn identity-discriminative features from unprocessed facial images and identify cross-modal image pairs. A novel Unit-Class Loss is proposed for preserving identity information while discarding modality information. In addition, a Cross-Modality Discriminator block is proposed for integrating image-pair classification capability into the network. The proposed network can be used to extract modality-independent vector representations or to classify matching pairs of test images. Our cross-modality face recognition experiments on five independent databases demonstrate that the proposed method achieves marked improvement over existing state-of-the-art methods. The applications of facial recognition (FR) systems have increased exponentially with the advent of deep convolutional neural networks. Automated FR is being used in personal devices, public surveillance, access control, security, marketing, and other applications. FR rates on visible images have increased considerably in the past few years. However, there are limitations to using FR in scenarios involving extreme variations in illumination, expression, pose, presentation attacks, and disguises [1-3]. Imaging technologies beyond the visible spectrum are being adopted to overcome the limitations of visible imaging.